1. INTRODUCTION

Loss of the ability to communicate with others can have devastating effects. Sign language is one way
to overcome this problem. It relies on gestures made with the hands together with facial
expressions, and it is a very effective and interactive way of communicating [1]. The difficulty with sign
language is that not everyone is familiar with the gestures, so it is challenging for people with disabilities
to communicate with people who do not know the language. Furthermore, it cannot be used to communicate
digitally. Another issue is that there is no universal sign language: every country has
its own sign language, with gestures whose meanings differ from those used in other countries.

To overcome this communication barrier, this study proposes a glove called GloSign.
Its major focus is to translate sign language into English. This paper focuses on American sign
language, as it is the most common sign language. American sign language is mostly used in America and
some parts of Canada. It was devised in the 19th century by the American School for the Deaf.
Like any other language, sign language has formal and informal parts. This paper covers the formal part of the
communication. The formal part of American sign language consists of the 26 letters of the alphabet, which can
be used to form words and sentences. The gestures associated with these letters are defined by four components:
the shape of the hand, its position in relation to the body, hand movement, and the alignment of the palm.
A few of the gestures are dynamic, meaning that they require movement of the hand. Figure 1 shows the basic
gestures for the English alphabet in American sign language.

The glove consists of flex sensors, an accelerometer, and a gyroscope to aid in recognizing the gestures made.
This wireless glove uses an IoT platform for uploading the data. The data is analyzed to identify the gesture
made with the glove, and the identified letters are then used to form words and sentences. The sentences are
displayed on the screen running the gesture recognition software and are also converted to speech.

Figure 1. American sign language


The remainder of this paper is divided into four sections. Section 2 reviews the literature, section 3
describes the method, section 4 presents the results and discussion, and section 5 concludes the paper.


2. LITERATURE REVIEW 

The literature reviewed here is organized by the types of sensors, the communication method, the algorithms
used, and how the gestures are output. In [2]-[6], the authors use a glove consisting of flex sensors. The
gestures recorded by these sensors were displayed on an LCD. This glove was able to predict all the letters
with reasonable accuracy. The focus of this project was to develop a low-cost glove to assist disabled people in
communicating. The authors of [7], [8] developed a wireless gesture decoder that used flex sensors and an
accelerometer to determine the gestures. These gestures were sent to a mobile application using Bluetooth.
The glove was able to determine all the letters and 15 words with an accuracy of 95%. In [8], a wireless
glove was developed that consisted of flex sensors and an inertial unit. The glove was able to communicate
with an Android application through Bluetooth. The glove achieved an accuracy of 98.2% when pressure
sensors were added to the system. A similar glove was designed using flex sensors. This system achieved an
accuracy of 83%. The system was able to determine the gestures using the voltage levels from the flex sensors.
After determining the gesture, it was displayed on a phone or laptop using Bluetooth communication.

The authors in [9]-[12] used flex sensors and an accelerometer to identify the gestures made by the glove.
It took around 0.74 s to convert the gesture to sound and text. The glove was able to convert basic words to text
and speech. The speech was stored on the card and played when the sign was made.

Tanyawiwat and Thiemjarus [13] made a low-cost portable glove for gesture recognition known as
GesTALK. It converts static gestures to speech. The system was able to work with American sign
language (ASL) and Pakistan sign language (PSL). The glove was able to achieve an accuracy of 90%. Another
glove in [14] had contact sensors placed along with flex sensors. This glove was able to convert gestures from
8 sign languages to text. It achieved an accuracy of 93.16%. El-Din and El-Ghany [15] used a glove with flex
sensors and an inertial sensor to determine the gestures made by the glove. The system was able to achieve an
accuracy of 88% with dynamic gestures. It was able to convert gestures from 2 sign languages, American sign
language (ASL) and Arabic sign language (ArSL), using a Python graphical user interface (GUI) program.

Tanyawiwat and Thiemjarus [13] added extra sensors, such as touch sensors, to the glove. To improve
the accuracy, the glove data was passed through a multivariable Gaussian distribution and a multi-objective
Bayesian framework for feature selection. The major problem was the ambiguity of the gestures, which caused
errors in determining the letters. Another paper [16] also used contact sensors to aid in the determination of
gestures. It utilized a k-nearest neighbors (KNN) algorithm to increase the efficiency of the system to 91.54%.
Ahmed et al. [17] and Arif et al. [18] designed a glove with contact sensors, flex sensors and an inertial sensor.
This framework achieved an accuracy of 92% using a gesture recognition algorithm. This gesture recognition
algorithm used the readings from the contact sensor to determine the gesture. After the algorithm generated the
possible letters, it corroborated them with the flex sensor readings to make the decision more precise.
At the end, values from the inertial unit were used to finalize the letters.

Wu et al. [19] used surface electromyography (EMG) sensors along with an accelerometer to determine
the gestures made. After getting the readings, the data was passed through multiple classifiers to get more
accurate responses. The system was able to achieve an accuracy of 96% on 80 gestures. Abhishek et al. [20]
used capacitive touch sensors to help in the determination of the gestures. The system was able to
determine gestures in 0.7 s with an accuracy of 92% using Python code. Mehdi and Khan [21] used a 7-sensor
glove from the 5DT company. It includes a tilt sensor to determine the rotation of the glove. The data from the
glove was passed through a three-layer neural network to assist in finding the
letters. The first layer consisted of raw sensor values, which were passed to a hidden layer with 52 nodes. The
third layer consisted of 26 nodes, each associated with a letter of the alphabet. This algorithm achieved an
accuracy of 88% in determining the gestures. The papers [22], [23] use Immersion's 18-sensor CyberGlove,
which consists of resistive bend, abduction and flexion measuring sensors. This framework gets the raw data
from the sensors and passes it to a neural network. This system was able to achieve an accuracy of 90%, but
the major drawback of this framework was that it was not real-time.

In this paper, we propose a glove that can translate gestures into letters. This glove is:

— Wireless and portable [24] 

— Real-time 

— Able to form words and sentences [25] 
— Accessible anywhere using IoT platform 

The technological developments of the present era have paved the way for state-of-the-art and
competent solutions to such problems. The literature review emphasizes the features
and limitations of the gloves developed so far. The glove under consideration stands
out because it is designed to cover the gaps left by earlier gloves. It offers real-time results, which is a
significant benefit in situations where a prompt response is required. These results are demonstrated in the
results section. Its portability and ease of access also make it a convenient option for
healthcare experts. By addressing the shortcomings of earlier gloves, it offers an advanced solution
that can contribute to enhancing the quality of patient treatment.


3. METHOD 

The transformation of sign language into English language using the GloSign glove involves multiple 
stages that require careful attention to detail. Firstly, the sensors on the glove must be selected and placed 
correctly to capture the movements of the wearer's hands accurately. This is crucial for the accurate 
interpretation of sign language gestures. Secondly, an internet of things (IoT) platform must be connected with 
the GloSign glove to transmit data to a computer or mobile device. This allows for real-time interpretation of 
sign language gestures and makes communication between deaf or hard-of-hearing individuals and hearing 
individuals possible. 

Finally, the data collected from the GloSign glove is interpreted using machine learning algorithms, 
which have been trained on sign language datasets. These algorithms can recognize patterns in the movements 
of the hands and translate them into English language sentences or phrases. Through this process, GloSign is 
able to bridge the communication gap between deaf and hearing individuals, providing a more inclusive and 
accessible world. 


3.1. Selection and placement of the sensors

This subsection describes the method used in the selection and placement of the sensors. The sensors used
are flex sensors, contact sensors and an inertial measurement unit (IMU) sensor. The IMU sensor consists of
an accelerometer and a gyroscope.

The flex sensors are used to measure the angle at which the fingers are bent. The resistance of a flex sensor
changes with the bend angle. The ideal location for these sensors is on top of the glove. The flex
sensors are connected to the Arduino as shown in Figure 2. The first pin of the flex sensor
(red) is connected to the 3.3 V pin of the Arduino NANO IoT, while the other pin (blue) is connected to a resistor. The
node between the flex sensor and the resistor is connected to an analog input of the Arduino, while the other
end of the resistor (black) is grounded.

Figure 2. Flex sensor connection
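
The wiring described for Figure 2 forms a voltage divider, so the flex sensor resistance, and from it an
approximate bend angle, can be recovered from the analog reading. The following is a minimal sketch under
stated assumptions, not the firmware used in this work: the 10-bit ADC range, the 47 kΩ fixed resistor and the
flat and bent resistances (25 kΩ and 100 kΩ) are hypothetical values chosen for illustration, and the linear
mapping to degrees is a simplification.

```python
# Assumed constants (not given in the paper): ADC range, fixed resistor, sensor resistances.
VCC = 3.3            # supply voltage on the red wire, in volts
ADC_MAX = 1023       # full-scale count of an assumed 10-bit ADC
R_FIXED = 47_000.0   # assumed resistor between the analog node and ground, in ohms
R_FLAT = 25_000.0    # assumed flex sensor resistance with the finger straight
R_BENT = 100_000.0   # assumed flex sensor resistance at a 90 degree bend


def adc_to_voltage(adc_count, vcc=VCC, adc_max=ADC_MAX):
    """Convert a raw ADC count to the voltage at the analog input."""
    return vcc * adc_count / adc_max


def flex_resistance(v_out, vcc=VCC, r_fixed=R_FIXED):
    """Solve V_out = Vcc * R_fixed / (R_flex + R_fixed) for R_flex."""
    if v_out <= 0:
        raise ValueError("v_out must be positive")
    return r_fixed * (vcc / v_out - 1.0)


def bend_angle(r_flex, r_flat=R_FLAT, r_bent=R_BENT, max_angle=90.0):
    """Map resistance linearly onto degrees; values are not clamped."""
    return (r_flex - r_flat) / (r_bent - r_flat) * max_angle


def adc_to_angle(adc_count):
    """Full chain: ADC count -> voltage -> resistance -> bend angle."""
    return bend_angle(flex_resistance(adc_to_voltage(adc_count)))
```

Because the mapping is not clamped, readings slightly outside the calibrated range produce angles a little
below 0° or above 90°.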


American sign language has multiple gestures that look very similar and are very hard to distinguish.
The contact sensor is used to differentiate such similar gestures. The contact sensor is connected
to the Arduino in a similar way to that shown in Figure 2. The contact sensors are placed on the index and middle
fingers. This positioning helps in differentiating among many of the signs.

The IMU sensor is used to detect the dynamic gestures. The IMU sensor is part of the Arduino,
so it is placed on the top of the hand. Figure 3 shows the placement of all the sensors on the glove. The
glove is connected to an IoT platform using the Wi-Fi on the Arduino. The IoT platform used in this project is the
International Business Machines (IBM) Watson IoT platform. The Arduino sends the raw values to the IoT
platform. These values consist of accelerometer, gyroscope, and flex sensor data. The scatter plot in the
platform can be used to track how the values change when the gestures change. Later, these values
are extracted from the platform to the PC using the IBM Watson IoT software development kit (SDK).
Figure 3 also shows the flow of data from the glove to the PC for further processing.

Figure 3. Sensor placement and flow of data


3.2. Data from glove 

The data from the glove is uploaded to the IBM Watson IoT platform. This data needs to be extracted
and used to determine the gestures. The data is extracted through the Python SDK for the IBM Watson IoT
platform. It consists of flex sensor data, contact sensor data, accelerometer data, gyroscope data and
movement data. These sensor readings are then mapped and calibrated to the corresponding letters. The system is
trained offline; therefore, a data set is generated and used for training a machine learning model.
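
As an illustration of the mapping step, the sketch below turns one extracted event into a numeric feature
vector laid out like the columns of Table 1 (five flex angles, then the contact and movement flags). The JSON
field names (`f1` to `f5`, `contact`, `movement`) are hypothetical, since the payload format is not specified
here, and the function is independent of the IBM SDK so that it can be tested offline.

```python
import json

# Hypothetical payload field names; the actual event format is not given in the paper.
FLEX_KEYS = ("f1", "f2", "f3", "f4", "f5")


def event_to_features(payload):
    """Convert one glove event (dict or JSON string) into [F1..F5, C, M]."""
    if isinstance(payload, str):
        payload = json.loads(payload)
    flex = [float(payload[key]) for key in FLEX_KEYS]
    contact = 1.0 if payload.get("contact") else 0.0
    movement = 1.0 if payload.get("movement") else 0.0
    return flex + [contact, movement]
```

A function of this kind could be called from whatever callback receives device events, with the resulting
vectors appended to the offline training set.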

To improve the accuracy of the detection, a KNN algorithm is used. This machine learning algorithm
classifies the letters. To find the optimal value of K for predicting the signs,
K values ranging from 1 to 25 were each tested for 1,000 iterations
and the accuracy was recorded. Among the K values with the best accuracy, the smallest was
chosen: the best accuracy gives the most reliable predictions, and the lowest K gives the fastest system.
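
The selection rule described above can be sketched as follows. This is a minimal pure-Python KNN, assuming
Euclidean distance and majority voting (the distance metric is not stated in this paper), run on synthetic
two-class data generated by the hypothetical helper `make_demo_data` rather than on the glove data set.

```python
import math
import random
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training samples nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k):
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(knn_predict(train, features, k) == label for features, label in test)
    return hits / len(test)


def choose_k(train, test, k_values=range(1, 26)):
    """Return the smallest K that reaches the best accuracy, plus all scores."""
    scores = {k: accuracy(train, test, k) for k in k_values}
    best = max(scores.values())
    smallest_best_k = min(k for k, score in scores.items() if score == best)
    return smallest_best_k, scores


def make_demo_data(seed=0, per_class=40):
    """Synthetic, well-separated clusters standing in for two gestures."""
    rng = random.Random(seed)
    centers = {"a": (0.0, 0.0), "b": (10.0, 10.0)}
    train, test = [], []
    for label, (cx, cy) in centers.items():
        points = [((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)), label)
                  for _ in range(per_class)]
        train += points[:30]   # 75% of each class for training
        test += points[30:]    # remaining 25% for testing
    return train, test
```

Calling `choose_k(*make_demo_data())` returns the chosen K together with the accuracy recorded for every K
from 1 to 25, which is the information needed to plot accuracy against K.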

The model with the best K value can then be deployed. This model receives the data from the IBM
Watson IoT platform and predicts the closest letters. Because some of the gestures are similar to each
other, duplicate or incorrect letters may be predicted. This issue is dealt with
after the sentence formation process.

The letters predicted at this stage are then used to form words and sentences. Spaces can be added
to a sentence using an additional gesture. At the end of the sentence generation stage, the sentence is passed
through a filter that corrects any erroneous words generated during this process. If a word is
identified as incorrect, the filter examines the letters whose gestures are similar to those of the letters used
in the word in order to fix it. This is done for every word in the sentence. After every word in the sentence is
deemed correct, the whole sentence is passed through a grammar checker to verify that the sentence is
grammatically correct. This procedure, called the gesture fix algorithm, takes around 2-8 seconds depending on
the length of the sentence. It is responsible for fixing the words in the
sentence and making sure the sentence is grammatically intelligible. The gesture fix algorithm processes the
sentence as a whole. The advantage of this is that no delay is introduced while the sentence is being signed.
If the system processed the sentence word by word, it could cause delays and lose some letters while the
previous word was being processed.
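
The word-level part of this filter can be sketched as below. This is an illustrative simplification, not the
gesture fix algorithm itself: the table of confusable letters is an assumption based on the confusions reported
in this paper, the vocabulary is a toy word list standing in for a real dictionary, only a single edit per word
is attempted, and the grammar check is omitted.

```python
# Assumed confusable pairs; a real table would be derived from the classifier's errors.
CONFUSABLE = {"i": "j", "j": "i", "o": "c", "c": "o",
              "a": "m", "m": "a", "h": "r", "r": "h"}


def candidates(word):
    """Yield words one edit away: a confusable-letter swap or a single deletion."""
    for pos, letter in enumerate(word):
        if letter in CONFUSABLE:
            yield word[:pos] + CONFUSABLE[letter] + word[pos + 1:]
        yield word[:pos] + word[pos + 1:]


def fix_word(word, vocabulary):
    """Return the word itself if known, else the first known candidate."""
    if word in vocabulary:
        return word
    for candidate in candidates(word):
        if candidate in vocabulary:
            return candidate
    return word  # leave unknown words unchanged


def fix_sentence(sentence, vocabulary):
    """Apply the word-level fix to every word of a space-separated sentence."""
    return " ".join(fix_word(word, vocabulary) for word in sentence.split())
```

With a vocabulary containing the words of the pangram used later in this paper, an input such as
`"the qujck brcwn fox jiumps over"` is mapped back to `"the quick brown fox jumps over"`: the swap rule repairs
"qujck" and "brcwn", and the deletion rule removes the extra letter in "jiumps".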

After the sentence is generated, it is displayed on the screen running the gesture recognition
software and then converted to speech using the IBM Watson text to speech SDK. This
program can be installed on any PC that has a stable internet connection and can run Python
software. However, performance may differ depending on the specification of the system.


4. RESULTS AND DISCUSSION 

This section discusses the findings from the experiments conducted with the glove. The system is
divided into three parts. The first part is uploading data to the IoT platform. The second part consists of
analyzing and decoding the data. The final part is outputting the data.


4.1. IoT platform and sensors 

The IoT platform used in this experiment is the IBM Watson IoT platform. The Arduino NANO IoT on the
glove is connected to this platform using Wi-Fi communication. The data from the glove is divided into two
parts: the data from the flex sensors, and the data from the contact sensor and IMU. The IMU and contact sensor
outputs are Boolean. If the contact sensors are touching each other, the Arduino registers a reading of 1;
otherwise it registers 0. The IMU sensor built into the Arduino NANO IoT is used to determine whether the glove
is being moved. If the glove is being moved, a reading of 1 is registered; otherwise 0 is sent.
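
One simple way to reduce IMU readings to such a Boolean flag is to threshold the magnitude of the angular
rate. The sketch below assumes gyroscope samples in degrees per second and a hypothetical threshold of
30 °/s; the actual criterion used on the glove is not specified in this paper.

```python
import math

# Assumed threshold; a practical value would be tuned on recorded glove data.
THRESHOLD_DPS = 30.0


def is_moving(gyro_dps, threshold_dps=THRESHOLD_DPS):
    """True when the angular-rate magnitude of one (x, y, z) sample exceeds the threshold."""
    return math.sqrt(sum(axis * axis for axis in gyro_dps)) > threshold_dps


def movement_flag(samples, threshold_dps=THRESHOLD_DPS):
    """Return 1 if any sample in the window indicates motion, else 0."""
    return int(any(is_moving(sample, threshold_dps) for sample in samples))
```

Evaluating the flag over a short window of samples rather than a single reading makes it less sensitive to the
exact instant at which the gesture is sampled.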

Table 1 shows the average readings of around 5,000 gestures. The contact and movement values are
either 0 or 1. The contact value is 1 if contact is registered in the gesture. The movement value is 1 when
a dynamic gesture is registered. The flex sensors have a range between 0 and 90°. The flex sensors F1, F2, F3,
F4 and F5 represent the fourth, third, second and first fingers and the thumb, respectively. The data in
Table 1 shows that many of the gestures have very similar sensor values. For example, the gestures “i” and
“j” have very similar sensor values except for the value of the movement sensor.


Table 1. Average sensor values for each gesture 


Alphabet      F1      F2      F3      F4      F5   C   M
a          43.03   25.94   39.80   51.28    0.27   1   0
b           0.55   -4.49   -3.45   -0.05   30.30   1   0
c          19.27   30.81   52.50   33.57    2.92   1   0
d          32.01   42.04   51.73    0.13   -1.50   0   0
e          66.83   71.89   73.64   58.29   35.41   1   0
f           5.98    0.80    1.22   69.99    5.40   0   0
g          76.21   66.59   83.98   -0.50    4.54   0   0
h          59.45   56.08   -4.44   -0.03   12.33   0   0
i           3.45   79.17   65.58   79.62   17.40   1   0
j           2.78   71.62   60.66   74.46   15.61   1   1
k          70.55   62.95   -6.81   -0.66   -0.64   0   0
l          83.53   73.88   77.67   -1.13   -0.82   0   0
m          77.58   57.87   53.03   54.70   10.29   1   0
n          90.52   71.54   58.45   61.96   -2.11   1   0
o          44.81   43.63   51.74   42.93    7.46   1   0
p          57.93   49.62    2.35   -0.80   -3.45   0   0
q          74.20   69.96   68.03   -0.21   -2.14   0   0
r          60.91   47.69   -5.54   -1.10    6.88   0   0
s          95.83   78.83   71.81   75.64   35.59   1   0
t          76.92   34.22   40.41   52.19     -68   0   0
u          82.84   31.52   -6.82   -1.11   13.97   1   0
v          77.50   38.18   -7.07   -0.76   14.15   0   0
w          53.30   -4.28   -7.13   -0.47   12.56   0   0
x          76.99   75.49   70.31   58.34   41.50   0   0
y           1.06   74.33   68.31   80.23   -1.50   1   0
z          64.54   62.94   73.39   -0.73    7.17   0   1
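
The similarity between gestures can be quantified directly from Table 1. The sketch below is an illustration
rather than part of the GloSign software: it computes the Euclidean distance between the average flex readings
(F1 to F5) of selected letters, using values copied from the table, and ignores the C and M flags.

```python
import math

# Average readings (F1..F5, C, M) copied from Table 1 for a few letters.
AVERAGES = {
    "b": (0.55, -4.49, -3.45, -0.05, 30.30, 1, 0),
    "h": (59.45, 56.08, -4.44, -0.03, 12.33, 0, 0),
    "i": (3.45, 79.17, 65.58, 79.62, 17.40, 1, 0),
    "j": (2.78, 71.62, 60.66, 74.46, 15.61, 1, 1),
    "r": (60.91, 47.69, -5.54, -1.10, 6.88, 0, 0),
}


def flex_distance(first, second):
    """Euclidean distance between the five average flex angles of two letters."""
    return math.dist(AVERAGES[first][:5], AVERAGES[second][:5])


def differing_flags(first, second):
    """Names of the Boolean flags (C, M) on which two letters disagree."""
    names = ("C", "M")
    pairs = zip(names, AVERAGES[first][5:], AVERAGES[second][5:])
    return [name for name, a, b in pairs if a != b]
```

For example, `flex_distance("i", "j")` is roughly 10.6°, compared with well over 100° for
`flex_distance("i", "b")`, and `differing_flags("i", "j")` returns only the movement flag.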

Figure 4 shows how the data is visualized on the platform to aid in understanding different gestures.
The C (contact sensor) and M (dynamic gesture) traces show the contact sensor and movement readings. The
value C, shown in turquoise, goes high when the contact sensors are touching each other. The value M,
shown in light turquoise, goes high when there is any movement of the glove.

Figure 4. Visualization of C and M from glove on IBM Watson IoT platform


Figure 5 shows the flex sensor readings. The chart shows the readings of the five flex sensors: F1
is the fourth finger, F2 is the third finger, F3 is the second finger, F4 is the first finger and F5 is the thumb.
The graph shows the readings for the hand at rest.

Figure 5. Visualization of flex sensors from glove on IBM Watson IoT platform


4.2. Analyzing and decoding data 

The data from the IoT platform is extracted for analysis using the Python
application programming interface (API) for the IBM Watson IoT platform. The k-nearest neighbors (KNN)
supervised machine learning model is used for classification of gestures. For training the model, 200 gestures
were signed and the readings were recorded. The average value of these readings is shown in Table 1. After
recording the readings, the gestures were labeled according to the American sign language alphabet and the
labels were made available to the KNN model.

To verify the accuracy of the system, the gesture readings were split into two parts: training and
testing. 75% of the data was used as the training set for KNN and 25% was used for testing and measuring
the accuracy of the system. The bar chart in Figure 6 shows the accuracy for identifying each letter when K
was set to 1 (1-NN). It can be seen from Figure 6 that the majority of the gestures were identified with 100%
accuracy. The dynamic gestures were identified with an accuracy of less than 95%. The most common mistake was in
telling apart the letters “i” and “j”, as they have the same sensor values except for the value of the movement
sensor. The letter “j” sets the movement sensor to 1. The issue is that when the movement finishes or the glove is
stationary, the “j” is read as “i”. Aside from “j”, there were discrepancies in identifying “h” and “r”, as
they had quite similar sensor readings. But by far the most problematic letter was “j”.
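
The evaluation procedure can be sketched as follows: a shuffled 75%/25% split, followed by accuracy computed
separately for each letter, as plotted in a per-letter bar chart. The function names and the fixed random seed
are illustrative assumptions, and the example labels in the usage note are made up rather than measured.

```python
import random
from collections import defaultdict


def split_75_25(samples, seed=0):
    """Shuffle a copy of the samples and split it into 75% training and 25% testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.75)
    return shuffled[:cut], shuffled[cut:]


def per_letter_accuracy(true_labels, predicted_labels):
    """Return {letter: fraction of that letter's test samples predicted correctly}."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, predicted in zip(true_labels, predicted_labels):
        total[truth] += 1
        correct[truth] += int(truth == predicted)
    return {letter: correct[letter] / total[letter] for letter in total}
```

For instance, if four test samples of “j” were predicted as “j”, “j”, “i”, “i”, the per-letter accuracy for “j”
would be 0.5, which is how a drop for a dynamic gesture would appear in such a chart.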


Figure 6. Accuracy of the gestures


To increase the precision of the KNN algorithm, different values of K were tested. The mean error 
and accuracy were generated to determine which K would be best for the existing framework. Figure 7 shows 
the average mean errors for various values of K. The graph shows that the best values for K are K=1 and 3 with 
average error rate below 0.5%. 
Figure 7. Mean error of K values

Figure 8 depicts the average accuracy of identifying gestures for various values of K. Consistent with the
mean error results, it shows that the system can achieve an accuracy of more than 99.5% if K is set to 1
or 3. As the value of K increases, more computing power and time are required to analyze the data. So, the ideal
choice is the lowest value of K that has an acceptable accuracy and mean error rate. The ideal
option here is K=1.


Figure 8. Accuracy of K values


The glove shown in Figure 9 was tested using the pangram “The quick brown fox jumps over a lazy dog”.
This pangram is a good test of the glove's ability to form sentences, as it
contains every letter of the alphabet. The glove was powered by a battery pack connected at the bottom of the
glove.


Figure 9. GloSign glove 

Figure 10 shows the console output of the system while processing the gestures made for signing the
pangram “The quick brown fox jumps over a lazy dog”. It is quite evident that there are some issues between
“i” and “j” due to the movement of the hand. The other issues are due to similar hand gestures. These errors are
corrected by passing the sentence through the gesture fix algorithm. In this case the algorithm was able to fix
all the errors.


Console output (excerpt):

    the qujck brcwn fox jiumps over
    processing sentence...
    The quick brown fox jumps over a lazy dog



Figure 10. System output 


4.3. Outputting the data 

After the execution of the gesture fix program, the final sentence is displayed on the console as shown
in Figure 10. It is also played back using the IBM Watson text to speech module. Figure 11 shows the conversion
of the text to speech using this module. The audio file is stored and played at the end of the program.
The module can be configured to change the dialect and voice.


    ce="en-US_AllisonV3Voice" dialect="American"
    ing speech from text...
    file stored as speech.mp3


Figure 11. Text to speech


4.4. Discussion 

In this paper, a glove for interpreting the gestures of American sign language has been discussed.
Machine learning and sentence-level error correction have improved the output of the system for the few letters
with similar gestures. The glove could be further improved by utilizing additional contact sensors for
identifying ambiguous gestures. The gesture fix algorithm could be optimized further to run faster. Analysis
and correction at the word level instead of the sentence level could improve the speed of the system. Similarly,
further improvement could be achieved by predicting next words and sentence endings. The whole system can be
placed on the IBM platform using Node-RED. This would make the system easily accessible from anywhere and from
any device. However, it could also slow the system down, as processing online would be slower. Moreover, this
project could be added to video chatting software to decode the gestures and display them on the screen. This
could give a new experience of attending meetings for people who use sign language to communicate.


5. CONCLUSION 
This paper proposes a glove called GloSign that translates sign language gestures to letters and
words. The system can also form sentences using the letters and words identified. The glove uses an IMU and
flex sensors to decode the sign language gestures. The sensor data is transmitted to the IBM Watson IoT
platform. The KNN machine learning algorithm is used for distinguishing between difficult or similar gestures.
The letters identified from the gestures are later combined to form sentences. These sentences are passed
through another layer of error correction, called the gesture fix algorithm, to resolve the mistakes in
detecting letters at word and sentence level. Finally, the output of the system is displayed on screen and
converted to speech for convenience. Further studies are necessary to improve the accuracy and speed of the
system, so that it can aid in verifying sign language gestures more competently.